The Learnability of Optimality Theory: An Algorithm and Some Basic Complexity Results

Author

  • Bruce Tesar
Abstract

If Optimality Theory (Prince & Smolensky 1991, 1993) is correct, Universal Grammar provides a set of universal constraints which are highly general, inherently conflicting, and consequently rampantly violated in the surface forms of languages. A language’s grammar ranks the universal constraints in a dominance hierarchy, higher-ranked constraints taking absolute priority over lower-ranked constraints, so that violations of a constraint occur in well-formed structures when, and only when, they are necessary to prevent violation of higher-ranked constraints. Languages differ principally in how they rank the universal constraints in their language-specific dominance hierarchies. The surface forms of a given language are structural descriptions of inputs which are optimal in the following sense: they satisfy the universal constraints, or, when these constraints are brought into conflict by an input, they satisfy the highest-ranked constraints possible. This notion of optimality is partly language-specific, since the ranking of constraints is language-particular, and partly universal, since the constraints which evaluate well-formedness are (at least to a considerable extent) universal. In many respects, ranking of universal constraints in Optimality Theory plays a role analogous to parameter-setting in principles-and-parameters theory. Evidence in favor of this Optimality-Theoretic characterization of Universal Grammar is provided elsewhere; most work to date addresses phonology: see Prince & Smolensky 1993 (henceforth, ‘P&S’) and the several dozen works cited therein, notably McCarthy & Prince 1993; initial work addressing syntax includes Grimshaw 1993 and Legendre, Raymond & Smolensky 1993. Here, we investigate the learnability of grammars in Optimality Theory. 
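The evaluation scheme described above, strict domination with each candidate assessed against a ranked list of constraints, can be sketched as a lexicographic comparison of violation profiles. This is an illustrative reconstruction, not code from the paper; the toy constraints ONSET and NOCODA and the string-based candidates are hypothetical stand-ins for real structural descriptions.

```python
# A minimal sketch of Optimality-Theoretic evaluation, assuming constraints are
# functions from a candidate to a violation count.  The optimal candidate is the
# one whose violation profile is lexicographically least when constraints are
# read from highest- to lowest-ranked, so a single violation of a high-ranked
# constraint outweighs any number of lower-ranked violations.

def violation_profile(candidate, ranking):
    """Violations of each constraint, highest-ranked first."""
    return tuple(constraint(candidate) for constraint in ranking)

def optimal(candidates, ranking):
    """Return the candidate(s) with the lexicographically least profile."""
    best = min(violation_profile(c, ranking) for c in candidates)
    return [c for c in candidates if violation_profile(c, ranking) == best]

# Toy constraints over nonempty strings (hypothetical): ONSET penalizes a
# vowel-initial form, NOCODA penalizes a consonant-final one.
ONSET = lambda s: int(s[0] in "aeiou")
NOCODA = lambda s: int(s[-1] not in "aeiou")

print(optimal(["at", "ta", "tat"], [ONSET, NOCODA]))  # → ['ta']
```

Under the ranking ONSET >> NOCODA, 'ta' wins because its profile (0, 0) beats 'tat' at (0, 1) and 'at' at (1, 1); reversing the ranking would compare the profiles in the opposite order.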
Under the assumption of innate knowledge of the universal constraints, the primary task of the learner is the determination of the dominance ranking of these constraints which is particular to the target language. We will present a simple and efficient algorithm for solving this problem, assuming a given set of hypothesized underlying forms. (Concerning the problem of acquiring underlying forms, see the discussion of ‘optimality in the lexicon’ in P&S 1993: §9.) The fact that surface forms are optimal means that every positive example entails a great number of implicit negative examples: for a given input, every candidate output other than the correct form is ill-formed.1 As a consequence, even a single positive example can greatly constrain the possible grammars for a target language, as we will see explicitly. In §1 we present the relevant principles of Optimality Theory and discuss the special nature of the learning problem in that theory. Readers familiar with the theory may wish to proceed directly to §1.3. In §2 we present the first version of our learning algorithm, initially through a concrete example; we also consider its (low) computational complexity. Formal specification of the first version of the algorithm and proof of its correctness are taken up in the Appendix. In §3 we generalize the algorithm, identifying a more general core called Constraint Demotion (‘CD’) and then a family of CD algorithms which differ in how they apply this core to the acquisition data. We sketch a proof of the correctness and convergence of the CD algorithms, and of a bound on the number of examples needed to complete learning. In §4 we briefly consider the issue of ties in the ranking of constraints and the case of inconsistent data. Finally, we observe that the CD algorithm entails a Superset Principle for acquisition: as the learner refines the grammar, the set of well-formed structures shrinks.
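The Constraint Demotion core mentioned above can be sketched as a single update step on a stratified hierarchy. This is a hedged reconstruction of the idea, not the paper's formal specification: on an informative winner-loser pair, each constraint preferring the loser is demoted to just below the highest-ranked constraint preferring the winner. The list-of-strata and dict-of-violation-counts representations, and the constraint names, are illustrative assumptions.

```python
# A hedged sketch of one Constraint Demotion step.  A grammar hypothesis is a
# stratified hierarchy: a list of strata (sets of constraint names), with the
# highest-ranked stratum first.

def constraint_demotion(strata, winner_viols, loser_viols):
    """One demotion step; both dicts map each constraint name to a count."""
    # With shared marks canceled, a constraint prefers whichever candidate
    # it assigns fewer violations.
    winner_preferring = {c for c in winner_viols if loser_viols[c] > winner_viols[c]}
    loser_preferring = {c for c in winner_viols if winner_viols[c] > loser_viols[c]}
    if not loser_preferring:
        return strata  # the pair demands no reranking

    # Highest stratum containing a winner-preferring constraint.
    h = min(i for i, s in enumerate(strata) if s & winner_preferring)

    # Demote every loser-preferring constraint ranked at or above stratum h
    # into the stratum immediately below h (created if necessary).
    new = [set(s) for s in strata]
    movers = set()
    for i in range(h + 1):
        movers |= new[i] & loser_preferring
        new[i] -= loser_preferring
    if h + 1 == len(new):
        new.append(set())
    new[h + 1] |= movers
    return [s for s in new if s]  # drop any emptied strata

# With hierarchy {A, B} >> {C}, a winner violating only A and a loser violating
# only B force A below B, yielding [{B}, {A, C}].
```

Because each positive example supplies the winner, and every competing candidate is an implicit loser, a single observed form can trigger several such demotions; a pair with no winner-preferring constraint signals inconsistent data of the kind taken up in §4.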

Similar articles

A Novel Adapted Multi-objective Meta-heuristic Algorithm for a Flexible Bi-objective Scheduling Problem Based on Physics Theory

  We relax some assumptions of the traditional scheduling problem and suggest an adapted meta-heuristic algorithm to optimize efficient utilization of resources and quick response to demands simultaneously. We intend to bridge the existing gap between theory and real industrial scheduling assumptions (e.g., hot metal rolling industry, chemical and pharmaceutical industries). We adapt and evalua...


A learning problem that is independent of the set theory ZFC axioms

We consider the following statistical estimation problem: given a family F of real valued functions over some domain X and an i.i.d. sample drawn from an unknown distribution P over X, find h ∈ F such that the expectation E_P(h) is probably approximately equal to sup{E_P(h) : h ∈ F}. This Expectation Maximization (EMX) problem captures many well studied learning problems; in fact, it is equiva...



Learnability in Optimality Theory (Short Version)

A central claim of Optimality Theory is that grammars may differ only in how conflicts among universal well-formedness constraints are resolved: a grammar is precisely a means of resolving such conflicts via a strict priority ranking of constraints. It is shown here how this theory of Universal Grammar yields a highly general Constraint Demotion principle for grammar learning. The resulting lea...


The Aggregating Algorithm and Predictive Complexity

This thesis is devoted to on-line learning. An on-line learning algorithm receives elements of a sequence one by one and tries to predict every element before it arrives. The performance of such an algorithm is measured by the discrepancies between its predictions and the outcomes. Discrepancies over several trials sum up to total cumulative loss. The starting point is the Aggregating Algorithm...



Publication date: 1993